ABSTRACT


RECENT  RESULTS  ON  MULTI-STAGE 
SELECTION  PROCEDURES 
Klaus  J.  Miescke  (West-Lafayette) 


During the past few years, several new developments took
place in the area of sequential selection procedures. The purpose
of the present paper is to describe the major results and to
point out some open problems and interesting questions for
further research.

The  basic  goal  is  to  find  that  one  of  k  populations  which  is 
associated  with  the  largest  parameter  of  a  given  underlying  family 
of  distributions.  Additionally,  in  the  control  setting,  one 
wishes  to  decide  whether  this  parameter  is  large  enough,  i.e., 
larger  than  a  control  value.  A  major  topic  of  interest  is  to 
find  procedures  which  are  reasonably  economical,  i.e.,  which 
perform well without requiring too many observations. The
traditional criterion, due to R.E. Bechhofer, which is to guarantee
the probability of a correct selection outside of a certain
indifference zone, combined with the criterion of keeping the expected
total sampling amount small, constitutes the mainstream of
current research. On the other hand, some work has also been
done  in  the  decision  theoretic  approach,  but  due  to  the  inherent 
analytical  difficulties,  the  results  are  rather  incomplete  up 
to  now.  One  promising  direction  of  further  research  appears  to 
be  the  construction  of  procedures  which  are  not  too  complicated 
and,  at  the  same  time,  are  at  least  approximately  optimum  in  a 
decision  theoretic  sense. 
RECENT RESULTS ON MULTI-STAGE
SELECTION PROCEDURES

K.  J.  Miescke* 

Purdue  University 

1. Introduction. The problem of how to find the best (in terms of a
distributional parameter) of $k \ge 2$ populations, by means of observations drawn from
them, has been studied by many research workers in the past. A thorough
introduction to the field of Ranking and Selection as well as a complete overview
of all relevant developments up to 1979 is provided by Gupta and Panchapakesan
[16]. The bibliography contained therein can be complemented with the help of
Dudewicz and Koo [7].

The purpose of the present paper is to survey recent results in the
subfield of sequential or, respectively, multi-stage selection procedures which
are not already discussed in [16]. Moreover, several open problems will be
pointed out to encourage further work in this direction. The most remarkable
publication in this respect, without doubt, has been Bechhofer, Kiefer and
Sobel [1], which still serves as an inspiring source of results and ideas.

Let the populations, as usual, be denoted by $\pi_1,\ldots,\pi_k$. From every $\pi_i$, a
sequence of observations $X_{i1}, X_{i2}, \ldots$ is available to the experimenter. The
observations altogether are assumed to be independent. For every i, let the
$X_{ij}$'s have a density $f_{\theta_i}$ w.r.t. a sigma-finite measure $\mu$ on $\mathbb{R}$, which is the
Lebesgue or a counting measure, respectively. The family of densities
$\mathcal{F} = \{f_\theta\}_{\theta \in \Omega}$, $\Omega \subseteq \mathbb{R}$, is assumed to be known. In many papers, $\mathcal{F}$ is a
one-parameter exponential family. In the continuous case, the most prominent
example consists of k normal populations with unknown means $\theta_1,\ldots,\theta_k \in \Omega = \mathbb{R}$
and a common known variance $\sigma^2 > 0$ (Normal Case). In the discrete case, it
consists of k Bernoulli populations with unknown success probabilities
$\theta_1,\ldots,\theta_k \in \Omega = [0,1]$ (Bernoulli Case).

*Research supported by ONR Contract N00014-75-C-0455 at Purdue University.
A multi-stage selection procedure for finding that $\pi_i$ with $\theta_i = \max\{\theta_1,\ldots,\theta_k\}$
consists of four different types of decisions which have to be made anew at
each subsequent Stage m = 1, 2, ... . More precisely, at Stage m, based on the
observations drawn up to that point, the experimenter has to decide

(a) whether or not to stop (stopping rule);

(b) in case of not stopping: which populations to eliminate from further
consideration (elimination rule), and what kind of observations to take at the
next Stage m+1 (sampling rule);

(c) in case of stopping: which population(s) finally to select (terminal
decision rule).

Procedures can be categorized in different ways according to their
characteristics in (a), (b) or (c), respectively. Some of the terms which are used
frequently are the following.

Closed (open) sequential procedure: The number of observations which can be drawn
from $\pi_1,\ldots,\pi_k$ is a bounded (unbounded) random variable.

Truncated procedure: The number of stages is a bounded random variable.

q-stage procedure: The number of stages is a fixed constant $q \ge 1$.

Procedure  with  elimination:  At  every  stage,  populations  (which  appear  to  be 
inferior)  can  be  eliminated  from  terminal  decisions.  The  remaining  populations 
constitute  a  subset  selection.  Typically,  eliminated  populations  are  excluded 
also  from  further  sampling.  A  justification  for  this  will  be  given  later 
(cf.  Fact  5). 

Vector at a time sampling: At every Stage m, exactly $n_m$ observations are taken
from every non-eliminated population. The sample sizes $n_1, n_2, \ldots$ are determined
before the experiment starts.

Adaptive sampling: At every stage, the decision as to which observations are taken
next depends on the data collected up to that stage. A well known example for the
Bernoulli Case is the "play the winner sampling rule" which is due to H. Robbins
(cf. [16], p. 64).
Subset selection procedure: The final decisions are subsets of $\{\pi_1,\ldots,\pi_k\}$ of
random size. A correct selection (CS) occurs if the best population is included.

Fixed size t subset selection procedure: The terminal decisions are subsets of
$\{\pi_1,\ldots,\pi_k\}$ of fixed size t. In the case of t = 1, one calls such a procedure
simply a selection procedure. A correct selection has the same meaning as before.

All of the terms above are used quite consistently in the literature, except
for the term adaptive. Some authors call their procedures adaptive for other
reasons, even though they employ vector at a time sampling. See for example
Büringer, Martin and Schriever [5] (the nonparametric part), Hsu and Edwards [19]
and Tong [48].
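To make the contrast between vector at a time and adaptive sampling concrete, the following is a minimal sketch of a play the winner rule for Bernoulli observations. It assumes the commonly described cyclic form of the rule (stay with the current population after a success, move to the next one after a failure); the function name `play_the_winner` and the callback `draw` are hypothetical names chosen for this illustration and do not appear in the cited sources.

```python
from typing import Callable, List, Tuple


def play_the_winner(draw: Callable[[int], int], k: int, n_total: int) -> List[Tuple[int, int]]:
    """Take n_total Bernoulli observations adaptively from k populations.

    draw(i) returns 1 (success) or 0 (failure) for population i (0-based).
    Assumed rule: stay on the current population after a success and move
    to the next population (cyclically) after a failure.
    """
    history = []
    current = 0
    for _ in range(n_total):
        outcome = draw(current)
        history.append((current, outcome))
        if outcome == 0:
            # A failure triggers the switch; a success keeps the population.
            current = (current + 1) % k
    return history
```

Which population is sampled next depends on the data observed so far, which is exactly what distinguishes this rule from vector at a time sampling, where the sample sizes are fixed in advance.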


The classical approach to finding reasonable procedures is the following. Let
$\Omega^k(\delta^*) = \{\theta \in \Omega^k \mid D(\theta_{[k-1]}, \theta_{[k]}) \ge \delta^*\}$ be the so-called preference zone and
$\Omega^k \setminus \Omega^k(\delta^*)$ be the indifference zone, where D is a distance measure (usually taken to be
$\theta_{[k]} - \theta_{[k-1]}$ or $\theta_{[k]}/\theta_{[k-1]}$, resp.), $\delta^* > 0$ is fixed and $\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k]}$
denote the ordered values of $\theta_1,\ldots,\theta_k$. In the indifference zone approach, due to
R. E. Bechhofer (cf. [16], p. 8), only those procedures are considered which satisfy

(1) $\inf\{P_\theta(CS) \mid \theta \in \Omega^k(\delta^*)\} \ge P^*$,

where $P^* > k^{-1}$ is prespecified. A $\theta \in \Omega^k(\delta^*)$ is called a least favorable
configuration (LFC) if the infimum in (1) occurs at this $\theta$. A first step towards
establishing (1) for a suitable type of procedure thus is usually to find its LFC.

Among those procedures satisfying (1) one then naturally tries to find a candidate
with a small expected terminal subset size and, moreover, with a small average
sample number (ASN).
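As an illustration of requirement (1), consider the simplest situation: the Normal Case with common known variance and the one-stage procedure that takes n observations from each population and selects the one with the largest sample mean. For that procedure the slippage configuration $\theta_{[1]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \delta^*$ is the classical least favorable configuration, where P(CS) equals $\int \Phi(z + \delta^*\sqrt{n}/\sigma)^{k-1} \varphi(z)\,dz$. The sketch below evaluates this integral numerically and searches for the smallest n meeting a given $P^*$. The function names and the quadrature settings (`grid`, `lim`) are assumptions made for this example only.

```python
import math


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def pcs_slippage(k: int, n: int, delta_star: float, sigma: float = 1.0,
                 grid: int = 4000, lim: float = 8.0) -> float:
    """P(CS) of the one-stage natural procedure at the slippage configuration.

    Uses the trapezoidal rule on [-lim, lim]; grid and lim are illustrative
    numerical settings, not part of the statistical problem.
    """
    shift = delta_star * math.sqrt(n) / sigma
    h = 2.0 * lim / grid
    total = 0.0
    for j in range(grid + 1):
        z = -lim + j * h
        weight = 0.5 if j in (0, grid) else 1.0
        total += weight * std_normal_cdf(z + shift) ** (k - 1) * std_normal_pdf(z)
    return total * h


def smallest_n(k: int, delta_star: float, p_star: float,
               sigma: float = 1.0, n_max: int = 10000) -> int:
    """Smallest per-population sample size with P(CS) >= p_star at slippage."""
    for n in range(1, n_max + 1):
        if pcs_slippage(k, n, delta_star, sigma) >= p_star:
            return n
    raise ValueError("p_star not reached within n_max observations per population")
```

For k = 2 the integral reduces to $\Phi(\delta^*\sqrt{n}/(\sigma\sqrt{2}))$, which gives a convenient check of the quadrature.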

From a decision point of view, this (minimax type) approach means that the
risk w.r.t. a 0-1 terminal decision loss should not exceed $1-P^*$ on $\Omega^k(\delta^*)$ and
that, subject to this condition, the objective functions (expected terminal subset
size and ASN) are to be made small. One major objection to the indifference
zone approach is, however, that nothing is actually controlled if $\theta \notin \Omega^k(\delta^*)$.
Procedures based on this approach will be the main topic of Sections 3-5.

An alternative way of treating the problem is the one prescribed by
decision theory. Here one has to incorporate all losses due to inappropriate
decisions and costs of sampling into one (stage-dependent) loss function, and then
to consider the risk function, i.e. the expected loss, as the objective function
which has to be minimized. Within the class of permutation invariant procedures,
i.e. among those which give no a priori preference to any of the k populations,
several optimality results can be derived, especially in the Bayesian approach.
This will be described in Section 2 below.


2. The Decision Theoretic Approach. In this section we assume that $\mathcal{F}$ is a
one-parameter exponential family. More precisely, let

(2) $\mathcal{F} = \{c(\theta)\exp(\theta x)d(x),\ x \in \mathbb{R}\}_{\theta \in \Omega}$, where $\Omega \subseteq \mathbb{R}$ is an interval.

We consider the class $\mathcal{K}_I$, say, of permutation invariant sequential procedures
with (or without) elimination, which are based on vector at a time sampling.
Let $W_{im} = X_{i1} + X_{i2} + \cdots + X_{i(n_1+\cdots+n_m)}$ denote the sufficient statistic for $\theta_i$,
based on all observations which are available from $\pi_i$ up to Stage m, i = 1,...,k,
and let $W_m = (W_{1m},\ldots,W_{km})$, m = 1,2,... .

Let $L_m(\theta, (t_1,\ldots,t_{m+1}))$ be the loss which occurs at $\theta \in \Omega^k$ if a procedure
stops at Stage m and finally selects populations $t_{m+1} \subseteq \{\pi_1,\ldots,\pi_k\}$, after it has
eliminated at Stage j populations $t_j \subseteq \{\pi_1,\ldots,\pi_k\}$, j = 1,...,m, where $t_1,\ldots,t_{m+1}$
is a disjoint decomposition of $\{\pi_1,\ldots,\pi_k\}$. Assume that for every m, $L_m$ has the
following properties:

(3a) $L_m(\theta, (t_1,\ldots,t_{m+1})) = L_m((\theta_{\sigma(1)},\ldots,\theta_{\sigma(k)}), (\sigma(t_1),\ldots,\sigma(t_{m+1})))$,

where $\sigma(t_j) = \{\sigma(i) \mid i \in t_j\}$, j = 1,...,m+1, for every permutation $\sigma$ of (1,...,k).
(3b) $L_m(\theta, (t_1,\ldots,t_{m+1})) \ge L_m(\theta, (\tilde{t}_1,\ldots,\tilde{t}_{m+1}))$, if for a certain pair (i,j)
with $\theta_i \le \theta_j$ there exist integers $\alpha < \beta \le m+1$ such that $i \in t_\beta$, $j \in t_\alpha$,
$\tilde{t}_\alpha = (t_\alpha \setminus \{j\}) \cup \{i\}$, $\tilde{t}_\beta = (t_\beta \setminus \{i\}) \cup \{j\}$ and $\tilde{t}_\gamma = t_\gamma$ for $\gamma \ne \alpha, \beta$.

Condition (3a) states that $L_m$ is permutation invariant, and (3b) states that
a better population should be eliminated later than an inferior one. It is not
difficult to see that $-L_m$, if $L_m$ has these two properties, can be represented
by a function of $\theta$ and a permutation $(\sigma(1),\ldots,\sigma(k))$ of (1,...,k) which is
decreasing in transposition (DT). Functions with this property have been studied by
Hollander, Proschan and Sethuraman [18], and their results can be used to derive
several optimality results in the present context.

Since the risk function $R(\theta, \mathcal{P})$, $\theta \in \Omega^k$, of a procedure $\mathcal{P}$ from $\mathcal{K}_I$ is
permutation symmetric in $\theta$, uniformly (in $\theta$) best results in terms of the risk function
can be derived more easily in a Bayesian approach under a permutation symmetric
prior $\tau$. Thus, let $\tau$ be such a prior for the now random parameter vector $\Theta$. Then
one can prove, one after another, the following facts (cf. Gupta and Miescke
[12, 14, 15]). Let $m \ge 1$ be fixed in the sequel.

Fact 1. The density of $W_m$, defined on $\mathbb{R}^k \times \Omega^k$, is (DT).

Fact 2. The posterior density of $\Theta$ is (DT).

Fact 3. $-E\{L_m(\Theta, (t_1,\ldots,t_{m+1})) \mid W_m = w\}$, as a function of $(\sigma(1),\ldots,\sigma(k))$ and
w, is (DT). Hereby, $\{\sigma(1),\ldots,\sigma(q_1)\} = t_1$, $\{\sigma(q_1+1),\ldots,\sigma(q_2)\} = t_2$, and so
forth, for certain numbers $q_1,\ldots,q_{m+1}$ with $q_1 \le \cdots \le q_{m+1} = k$.
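A short worked step may help to see why the first of these facts is plausible. The sketch assumes the (DT) property of [18] in the following form: the function is invariant under a simultaneous permutation of both arguments, and it does not increase when two coordinates of w that are ordered in the same way as the corresponding coordinates of $\theta$ are interchanged. The symbols $N = n_1 + \cdots + n_m$, $d_N$ (the N-fold convolution of d) and $w^{(i,j)}$ (w with coordinates i and j interchanged) are notation introduced only for this illustration.

```latex
% Joint density of W_m = (W_{1m},...,W_{km}) under the exponential family (2),
% with N = n_1 + ... + n_m observations per population:
f(w \mid \theta) = \prod_{l=1}^{k} c(\theta_l)^{N} \exp(\theta_l w_l)\, d_N(w_l).

% Interchanging w_i and w_j only affects the exponent:
f(w \mid \theta)
  = \exp\bigl(\theta_i w_i + \theta_j w_j - \theta_i w_j - \theta_j w_i\bigr)\,
    f(w^{(i,j)} \mid \theta)
  = \exp\bigl((\theta_j - \theta_i)(w_j - w_i)\bigr)\, f(w^{(i,j)} \mid \theta)
  \;\ge\; f(w^{(i,j)} \mid \theta)
  \quad \text{if } \theta_i \le \theta_j \text{ and } w_i \le w_j .
```

Invariance under a simultaneous permutation of w and $\theta$ is immediate from the product form, so the density is larger when the statistics are ordered in the same way as the parameters.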

A natural terminal decision, at Stage m, selects only those populations
among the non-eliminated ones which yield the largest $W_{im}$-values. In the
discrete case, ties have to be split at random to get a procedure within $\mathcal{K}_I$. For
one-stage procedures it is well known that this type of decision is optimum in
several senses (cf. [16], p. 42 and Miescke [30]). The next result can be
considered as a generalization of the so-called "Bahadur-Goodman Theorem" (cf. [16],
p. 46).
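The natural terminal decision described above can be sketched in a few lines. The function name `natural_terminal_decision`, the dictionary representation of the surviving populations and the parameter `t` (terminal subset size) are assumptions of this example; ties are split at random by shuffling before a stable sort.

```python
import random
from typing import Dict, List, Optional


def natural_terminal_decision(w: Dict[int, float], t: int = 1,
                              rng: Optional[random.Random] = None) -> List[int]:
    """Select the t non-eliminated populations with the largest W-values.

    w maps the label of each non-eliminated population to its statistic.
    Ties are broken at random: labels are shuffled first, and the stable
    sort then keeps the shuffled order among equal values.
    """
    rng = rng or random.Random()
    labels = list(w)
    rng.shuffle(labels)
    labels.sort(key=lambda i: w[i], reverse=True)
    return sorted(labels[:t])
```

For example, `natural_terminal_decision({1: 3.0, 2: 5.5, 4: 4.1}, t=2)` keeps populations 2 and 4, since no ties occur among the values.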
Fact 4. For any $\mathcal{P} \in \mathcal{K}_I$, let $\mathcal{P}^*$ be the same procedure as $\mathcal{P}$ except for the
terminal decisions where $\mathcal{P}^*$ uses the natural ones. Then

(4) $R(\theta, \mathcal{P}^*) \le R(\theta, \mathcal{P})$, uniformly in $\theta \in \Omega^k$.

Actually, Fact 4 remains unchanged if one assumes that, at every Stage
m, the complete vector $W_m$ has been observed. In other words, one can state

Fact 5. Observations from eliminated populations are irrelevant for terminal
decisions.

The following, rather negative, statement may be considered, in certain
situations, as an argument against the use of adaptive sampling techniques.

Fact 6. For all situations where terminal decisions of a procedure with adaptive
sampling are based on unequal numbers of observations from the non-eliminated
populations, there is no terminal decision which is optimal uniformly in $\theta$.

One might now expect that, within stages where a procedure with elimination
does not stop, natural subset selections (i.e. subset selections associated with
largest $W_{im}$-values) have similar strong optimality properties as the natural terminal
decisions. It turns out, however, that this is only the case under certain
circumstances. First of all, results analogous to Fact 4 can be proved only for strongly
unimodal exponential families $\mathcal{F}$, i.e. where $f_\theta(x)$ or d(x), respectively, is
log-concave. The following is a key lemma.

Fact 7. If $\mathcal{F}$ is strongly unimodal, then the conditional density of $W_{m+1}$, given
$W_m = w$, which is derived from the joint distribution of $\Theta$, $W_m$ and $W_{m+1}$, is (DT).

If one assumes that at every Stage m, all observations $W_m$ are known (just
to simplify the proofs), then with backward induction it is not possible to
overcome the points where decisions have to be made on how many populations to
eliminate. An optimal decision would here utilize all observations, including those from
already eliminated populations. The following three results, however, can at least
be proved.
Fact 8. Let the number of stages q, say, as well as all subset sizes of
selections $r_1 \ge \cdots \ge r_q$, say, at Stages 1, 2, ..., q be fixed. Then the procedure
which uses the natural subset selections and natural terminal decision is the unique,
uniformly in $\theta$, best procedure in the sub-class of procedures in $\mathcal{K}_I$ with these
properties.

Fact 9. Within the sub-class of fixed size t two-stage procedures in $\mathcal{K}_I$, the
procedures which use natural subset selections at Stage 1 and the natural terminal
decision at Stage 2 constitute an essentially complete class.

Fact 10. Assume that $L_m$ depends on $\theta$ only through those $\theta_i$'s which are associated
with the populations not eliminated at Stage m, m = 1,2,... . If a priori,
$\Theta_1,\ldots,\Theta_k$ are independently identically distributed, then every truncated Bayes
procedure in $\mathcal{K}_I$ uses natural subset selections at all stages, and the natural
terminal decision.
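The type of procedure singled out in Fact 8 (fixed number of stages, fixed subset sizes, natural selections throughout) can be sketched as follows. The function name `fixed_size_multistage`, the callback `draw` and the list arguments `n` and `r` are hypothetical names for this illustration. Ties are resolved deterministically by the sort order here, whereas a procedure in the permutation invariant class would split them at random.

```python
from typing import Callable, List, Sequence


def fixed_size_multistage(draw: Callable[[int], float], k: int,
                          n: Sequence[int], r: Sequence[int]) -> List[int]:
    """Run a q-stage procedure with prescribed sample and subset sizes.

    draw(i) returns one observation from population i (0-based labels).
    n[m] is the per-population sample size at stage m+1 and r[m] is the
    number of populations kept after that stage; the last entry of r is
    the terminal subset size.
    """
    if len(n) != len(r) or r[0] > k or any(a < b for a, b in zip(r, r[1:])):
        raise ValueError("need len(n) == len(r) and k >= r[0] >= ... >= r[-1]")
    alive = list(range(k))
    w = [0.0] * k  # cumulative sums, playing the role of W_im
    for n_m, r_m in zip(n, r):
        for i in alive:
            # Vector at a time sampling: only surviving populations are sampled.
            w[i] += sum(draw(i) for _ in range(n_m))
        # Natural subset selection: keep the r_m largest cumulative sums.
        alive = sorted(alive, key=lambda i: w[i], reverse=True)[:r_m]
    return sorted(alive)
```

Eliminated populations are neither sampled nor considered again, in line with the typical convention mentioned in Section 1.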

After these considerations under rather mild assumptions upon the loss
functions, it is now natural to look for specific procedures which are optimal in
more concrete situations. In the control case, where one wishes to select the
best population, provided that it is better than the control (i.e. its parameter
exceeds the control value $\theta_0$),
two-stage procedures with elimination have been derived by Miescke [31] and
Gupta and Miescke [14]. A $\Gamma$-minimax approach, like the one discussed by Miescke
[32] for one-stage procedures, appears to be appropriate for such problems, but no
work has been done here up to now.

The difficulties arising in concrete problems with the backward induction
in sequential selection problems are considerable (cf. Edwards [8]). It seems to
be more realistic for future work to look for simply structured procedures which
are approximately optimal in a reasonable sense. Ad hoc procedures like the ones
proposed by Washburn [52] may have good performance characteristics and deserve
to be studied in more detail. Washburn's procedures are open and closed procedures
without elimination. They are based on prior knowledge and use adaptive sampling
which, as well as the terminal decision, is based on the posterior expectations
of $\Theta_1,\ldots,\Theta_k$. These procedures are not Bayes solutions in a decision theoretic
sense since no overall loss is actually considered here. However, as is shown
by Washburn [52], they perform well compared with other procedures given in the
literature.

Two  papers  which  give  more  detailed  Bayes  solutions  in  concrete  problems, 
but  which  do  not  completely  fit  into  the  distributional  framework  considered  so 
far,  are  due  to  Ramey  and  Alam  [41]  and  Gulati  [9].  Ramey  and  Alam  [41]  derive 
a  Bayes  truncated  procedure  for  the  most  probable  of  k  cells  in  a  multinomial 
model  under  a  Dirichlet  prior.  Gulati  [9]  considers  the  problem  of  finding  that 
one  of  k  uniform  distributions  which  has  slipped  to  the  left  (and  has  a  shorter 
support).  He  finds  the  Bayes  solution  with  respect  to  any  prior  specifying  the 
slipped  population,  within  the  class  of  closed  sequential  procedures  based  on 
adaptive  sampling. 

In  the  sections  to  come,  procedures  will  be  discussed  which  are  not  derived 
from  the  decision  theoretic  approach.  Most  of  them  are  fixed  size  t  (especially 
t  =  1)  subset  selection  procedures.  All  of  them  use  the  natural  subset  selections 
within  stages  and  the  natural  terminal  decisions. 

3. The Bernoulli Case and Related Models. The case of k Bernoulli populations
with unknown success probabilities $\theta_1,\ldots,\theta_k \in \Omega = [0,1]$ will be discussed only
briefly, since two detailed publications on recent developments in this area are
readily available. The first one is Büringer, Martin and Schriever [5]. In
their book, various sequential selection procedures with vector at a time as well
as play the winner sampling and different stopping rules are studied under the
indifference zone approach in great detail. The second one is the paper by
Bechhofer and Kulkarni [2] which provides an excellent overview not only of the
k-population selection problem but also of related areas like clinical trials
and multi-armed bandit problems, where the objective functions differ from those
in the selection problem. These related areas are also covered by Dudewicz and
Koo [7], where further references can be found.

The main topic of Bechhofer and Kulkarni [2], however, is to propose a closed
non-eliminating adaptive fixed size t subset selection procedure, which has several
optimality properties in terms of the P(CS) and the expected number of observations
taken from certain populations. Some results, which are proved for k = 2 only,
are conjectured to hold also for $k \ge 3$. In a subsequent paper, Bechhofer and
Kulkarni [3] provide additional results on the performance of their procedure.

Levin and Robbins [27] consider an open non-eliminating procedure with vector
at a time sampling which stops if one population has r more "successes" than all the
remaining ones. Among others, a conjecture is stated for a procedure with
elimination, which is proved to hold for the non-eliminating version.
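The stopping rule just described can be sketched as below, assuming one Bernoulli observation per population and stage. The function name `lead_by_r_procedure`, the callback `draw` and the safety cap `max_stages` are assumptions of this example; the cap is purely a programming safeguard, since the procedure itself is open.

```python
from typing import Callable, Tuple


def lead_by_r_procedure(draw: Callable[[int], int], k: int, r: int,
                        max_stages: int = 100000) -> Tuple[int, int]:
    """Sample all k >= 2 populations until one leads every other by r successes.

    draw(i) returns 1 (success) or 0 (failure) for population i (0-based).
    Returns (selected population, number of stages used).
    """
    successes = [0] * k
    for m in range(1, max_stages + 1):
        for i in range(k):
            # Vector at a time sampling with one observation per population.
            successes[i] += draw(i)
        ranked = sorted(successes, reverse=True)
        if ranked[0] - ranked[1] >= r:
            return successes.index(ranked[0]), m
    raise RuntimeError("no decision within max_stages (the procedure is open)")
```

Because the lead of the current best population changes by at most one per stage, the condition `>= r` is first met with equality.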

The related multinomial case (note that here independence of the
cell-frequencies is clearly not given, but that the joint distribution is (DT)), where
the goal is to find the cell with the largest probability, is also treated by
Levin and Robbins [27]. A closed sequential (inverse sampling) version of their
procedure has been studied already by Ramey and Alam [40], where a conjecture
concerning the LFC for k > 3 is stated. A Bayes procedure by Ramey and Alam [41]
has been mentioned already in Section 2. Further work has been done by Hwang [24]
and Hsuang, Hwang and Parnes [22]. The latter is actually a one-stage result, but
the question of whether more sampling is more informative is certainly of
relevance for sequential selection problems, too.


4. The Normal Case and Related Models. The problem of finding that one of k
normal populations $N(\theta_i, \sigma_i^2)$, i = 1,...,k, which has the largest mean in more than
one stage has been studied under several aspects (but mainly under the indifference
zone approach) by many research workers. In the following we shall distinguish
between three different situations depending on the status of knowledge about
$\sigma_1^2,\ldots,\sigma_k^2$. The first one is the simplest one: Here, $\sigma_1^2 = \cdots = \sigma_k^2 = \sigma^2$ and $\sigma^2 > 0$
is known. For this model, Bechhofer, Kiefer and Sobel [1] proposed and studied
their, meanwhile classical, open sequential procedure without elimination which
is based on vector at a time sampling with $1 = n_1 = n_2 = \cdots$. The terminal decision
is the natural one (which is optimum, cf. Fact 4), and the stopping rule is

(4) $N_{BKS} = \inf\{m \mid \sum_{i=1}^{k-1} \exp(-\delta^*(Y_{[k]m} - Y_{[i]m})) \le (1-P^*)/P^*\}$,

where $Y_{im} = (X_{i1} + \cdots + X_{im})/\sigma^2$, i = 1,...,k,

which establishes the procedure in the indifference zone approach (i.e. (1)). An
upper bound for the first moment of $N_{BKS}$ is given in Huang [23], and asymptotic
properties of this stopping rule, i.e. the behavior of the ASN under $P^* \to 1$ and/or
$\delta^* \to 0$, have been studied by Bechhofer, Kiefer and Sobel [1], Tong [49] and Jennison,
Johnstone and Turnbull [25].
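A minimal sketch of this stopping rule may be useful. It reads the rule as: stop at the first stage m at which the sum over the k-1 smaller statistics of $\exp(-\delta^*(Y_{[k]m} - Y_{[i]m}))$ is at most $(1-P^*)/P^*$, with $Y_{im}$ the cumulative sum of the observations from population i divided by $\sigma^2$. The function names `bks_stops` and `bks_stopping_time` and the argument `rows` (one vector of observations per stage) are assumptions of this example.

```python
import math
from typing import Iterable, Optional, Sequence, Tuple


def bks_stops(y: Sequence[float], delta_star: float, p_star: float) -> bool:
    """Check the stopping inequality for the statistics Y_1m, ..., Y_km of one stage."""
    y_sorted = sorted(y)
    y_max = y_sorted[-1]
    total = sum(math.exp(-delta_star * (y_max - y_i)) for y_i in y_sorted[:-1])
    return total <= (1.0 - p_star) / p_star


def bks_stopping_time(rows: Iterable[Sequence[float]], sigma2: float,
                      delta_star: float, p_star: float) -> Optional[Tuple[int, int]]:
    """Return (stopping stage, index of the selected population), or None
    if the supplied data run out before the rule stops."""
    sums = None
    for m, row in enumerate(rows, start=1):
        sums = list(row) if sums is None else [s + x for s, x in zip(sums, row)]
        y = [s / sigma2 for s in sums]
        if bks_stops(y, delta_star, p_star):
            # Natural terminal decision: the population with the largest statistic.
            return m, max(range(len(y)), key=lambda i: y[i])
    return None
```

The function returns `None` when the data are exhausted, which reflects that the procedure is open: no bound on the number of stages is known in advance.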

As with the BKS-procedure, many others can be viewed, in one way or another,
as being generalizations of Wald's SPRT (cf. [16], p. 127). This is also the case
with that one in Mukhopadhyay [37]. It differs from the BKS-procedure only
through its stopping rule, $N_M$, say:

(5) $N_M = \inf\{m \mid (k-1) \max_{1 \le i \le k-1} \exp(-\delta^*(Y_{[k]m} - Y_{[i]m})) \le 1-P^*\}$.

Apparently, $N_{BKS} \le N_M$ and $E_\theta(N_{BKS}) \le E_\theta(N_M)$ for $\theta \in \mathbb{R}^k$, which implies that
the BKS-procedure is more efficient. Usually it is more difficult to compare
procedures directly in this manner and asymptotic techniques are then the only
way to do this.
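The ordering of the two stopping times can be verified in one line. The sketch assumes that the BKS rule stops as soon as $\sum_{i=1}^{k-1} \exp(-\delta^*(Y_{[k]m} - Y_{[i]m})) \le (1-P^*)/P^*$ and that Mukhopadhyay's rule stops as soon as $(k-1)\max_{i<k} \exp(-\delta^*(Y_{[k]m} - Y_{[i]m})) \le 1-P^*$, with $0 < P^* \le 1$.

```latex
% At the stage m = N_M the second rule's inequality holds, hence
\sum_{i=1}^{k-1} \exp\bigl(-\delta^*(Y_{[k]m} - Y_{[i]m})\bigr)
  \;\le\; (k-1) \max_{1 \le i \le k-1} \exp\bigl(-\delta^*(Y_{[k]m} - Y_{[i]m})\bigr)
  \;\le\; 1 - P^*
  \;\le\; \frac{1 - P^*}{P^*} .
```

So the BKS inequality is satisfied at stage $N_M$ at the latest, which gives $N_{BKS} \le N_M$ on every sample path and therefore the same ordering for the expectations.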

Under the indifference zone approach, open sequential procedures with
elimination and vector at a time sampling, which are based on paired comparisons
of the sample means, are discussed by Swanepoel and Geertsema [44], Kao and Lai
[26], Hsu and Edwards [20], Turnbull, Kaspi and Smith [50] and Jennison, Johnstone
and Turnbull [25]. In the latter two papers, however, procedures with adaptive
sampling are the main topic.

An open sequential procedure without elimination based on vector at a time
sampling is proposed by Tong [48] in a more general setting including the normal
case. To avoid the indifference zone approach, the single stage P(CS) is hereby
estimated stage by stage (by replacing the unknown parameters by estimators), and
the procedure stops as soon as this estimate is greater than or equal to P*. It would
be interesting to derive a probability guarantee for the P(CS) in $\Omega^k(\delta^*)$. In
$\Omega^k \setminus \Omega^k(\delta^*)$, however, one has to expect (as with the BKS-procedure) a
large ASN (cf. McCulloch [28]).


Even if the ASN is finite, there remains the uncertainty of how long it
actually takes until an open procedure eventually stops (see also Bechhofer and
Kulkarni [2], 2.2). From a practical point of view, truncated and q-stage
procedures with elimination seem more likely to meet the needs of practitioners. A
two-stage procedure with elimination and vector at a time sampling is proposed by
Tamhane and Bechhofer [47]. The elimination hereby is made by means of Gupta's
maximum means rule (cf. [16], p. 232):

(6) Select $\pi_i$ if $Y_i \ge Y_{[k]} - d$, i = 1,...,k,

where $d = d(\delta^*, P^*, n_1, n_2)$ in the present context. A multi-stage procedure
which is a direct generalization of Tamhane and Bechhofer's procedure is proposed
by Tamhane [46].
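Rule (6) amounts to keeping every population whose first-stage statistic lies within d of the largest one. The sketch below assumes that the statistics are passed as a plain list and that d is already given; how d is determined from $\delta^*$, $P^*$, $n_1$ and $n_2$ is not shown. The function name `gupta_subset` is chosen for this illustration only.

```python
from typing import List, Sequence


def gupta_subset(y: Sequence[float], d: float) -> List[int]:
    """Return the 0-based labels i with y[i] >= max(y) - d.

    The population with the largest statistic is always retained, so the
    selected subset is never empty.
    """
    if d < 0:
        raise ValueError("d must be non-negative")
    y_max = max(y)
    return [i for i, y_i in enumerate(y) if y_i >= y_max - d]
```

With d = 0 the rule reduces to selecting the population(s) with the largest statistic; larger values of d retain more populations for the second stage.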

Several optimality properties of Gupta's subset selection rule have been
pointed out recently by Gupta and Miescke [11], Miescke [29], Gupta and Kim [10]
and Bickel and Yahav [4]. Thus the use of this rule at the first stage is
intuitively justified. No theoretical results, however, which support this idea are
known at present.
In a conservative approach, Tamhane and Bechhofer [47] use a lower bound
for the P(CS) to find a most economical pair $(n_1, n_2)$ in the indifference zone
approach. Their conjecture that the slippage configuration
$\theta_{[1]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \delta^*$ is the LFC has been proved to be correct for k = 3 by Miescke and
Sehr [33]. The case of k > 3 is still open. Techniques for finding LFC's
as well as results for other procedures are given in Gupta and Miescke [13].

For  the  control  problem,  several  two-stage  procedures  are  considered  and 
discussed  in  Gupta  and  Miescke  [14]  and  Miescke  [31]. 

The second situation, where still $\sigma_1^2 = \cdots = \sigma_k^2 = \sigma^2$, but $\sigma^2 > 0$ is now
unknown, is also considered by Kao and Lai [26] and Jennison, Johnstone and
Turnbull [25].

Open sequential procedures without elimination based on vector at a time
sampling are studied by Wackerley [51] (in a more general approach) and by
Mukhopadhyay and Chou [38]. The latter perform an analysis similar to that of
Mukhopadhyay [37]. Mukhopadhyay [36] deals with the case of k = 2.

The third situation, where $\sigma_1^2,\ldots,\sigma_k^2$ are unknown and possibly unequal,
is the most difficult one. It is questionable, however, whether it is still
reasonable to look for a population with the largest mean which perhaps might
also have the largest variance.

Two-stage procedures without elimination, based on the classical Stein
technique (cf. [16], p. 23) to determine the common sample size at Stage 2
depending on the estimated variances at Stage 1, are considered by Rinott [42]
and Mukhopadhyay [34] under the indifference zone approach. A three-stage
procedure employing the Stein approach, followed by elimination via Gupta's rule at
Stage 2, is proposed by Hochberg and Marcus [17] in the indifference zone approach.
For the case of k = 2 populations, other procedures have been considered by
Mukhopadhyay [35]. Procedures are also given by Swanepoel and Geertsema [44].


For  the  control  case,  open  and  closed  sequential  procedures  with  elimination 
and  vector  at  a  time  sampling  are  proposed  by  Hsu  and  Edwards  [19]. 

5.  Other  results.  For  the  problem  of  finding  the  normal  population  with  the 
smallest  variance,  Mukhopadhyay  and  Chou  [39]  give  an  open  sequential  procedure 
without  elimination  based  on  vector  at  a  time  sampling.  The  basic  construction 
is  the  same  as  before  in  Mukhopadhyay  [37].  For  the  linear  regression  model, 
another  type  of  sequential  procedure  is  proposed  by  Hsu  and  Huang  [21]. 

Finally, several papers dealing with nonparametric sequential procedures
are presented by Swanepoel and Venter [45], Swanepoel [43], Carroll [6],
Büringer, Martin and Schriever [5] and Hsu and Edwards [20].